Overview

Dataset statistics

Number of variables: 25
Number of observations: 29285
Missing cells: 285132
Missing cells (%): 38.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.9 MiB
Average record size in memory: 785.8 B
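
The missing-cell percentage can be cross-checked against the other figures in this table. The sketch below assumes the percentage is the missing-cell count divided by variables times observations; the report does not state the formula, but this assumption reproduces the printed 38.9%.

```python
# Figures copied from the Dataset statistics table.
n_variables = 25
n_observations = 29285
missing_cells = 285132

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```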

Variable types

Categorical: 11
Numeric: 11
Unsupported: 3

Alerts

ClaseVehiculo__c has constant value "99999" Constant
TipoVehiculo__c has constant value "99999" Constant
PlacaVehiculo__c has a high cardinality: 8933 distinct values High cardinality
n_prod_prev is highly correlated with total_siniestros and 2 other fields High correlation
total_siniestros is highly correlated with n_prod_prev and 2 other fields High correlation
total_pagado_smmlv is highly correlated with n_prod_prev and 3 other fields High correlation
anios_ultimo_siniestro is highly correlated with n_prod_prev and 2 other fields High correlation
Activos__c is highly correlated with AnnualRevenue High correlation
AnnualRevenue is highly correlated with Activos__c and 1 other fields High correlation
MontoAnual__c is highly correlated with total_pagado_smmlv High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
total_siniestros is highly correlated with total_pagado_smmlv High correlation
total_pagado_smmlv is highly correlated with total_siniestros High correlation
AnnualRevenue is highly correlated with EgresosAnuales__c High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
total_siniestros is highly correlated with total_pagado_smmlv and 1 other fields High correlation
total_pagado_smmlv is highly correlated with total_siniestros and 2 other fields High correlation
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other fields High correlation
AnnualRevenue is highly correlated with EgresosAnuales__c High correlation
MontoAnual__c is highly correlated with total_pagado_smmlv High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
TipoVehiculo__c is highly correlated with churn and 8 other fields High correlation
churn is highly correlated with TipoVehiculo__c and 2 other fields High correlation
EstadoCivil__pc is highly correlated with TipoVehiculo__c and 3 other fields High correlation
ClaseVehiculo__c is highly correlated with TipoVehiculo__c and 8 other fields High correlation
CodigoTipoAsegurado__c is highly correlated with TipoVehiculo__c and 4 other fields High correlation
ciudad_name is highly correlated with TipoVehiculo__c and 2 other fields High correlation
tipo_ramo_name is highly correlated with TipoVehiculo__c and 2 other fields High correlation
Genero__pc is highly correlated with TipoVehiculo__c and 3 other fields High correlation
tipo_prod_desc is highly correlated with TipoVehiculo__c and 2 other fields High correlation
FechaInicioVigencia__ctrim is highly correlated with TipoVehiculo__c and 2 other fields High correlation
PuntoVenta__c is highly correlated with n_prod_prev and 2 other fields High correlation
tipo_ramo_name is highly correlated with tipo_prod_desc High correlation
tipo_prod_desc is highly correlated with tipo_ramo_name High correlation
FechaInicioVigencia__ctrim is highly correlated with churn and 2 other fields High correlation
churn is highly correlated with FechaInicioVigencia__ctrim and 3 other fields High correlation
n_prod_prev is highly correlated with PuntoVenta__c and 3 other fields High correlation
total_siniestros is highly correlated with PuntoVenta__c and 4 other fields High correlation
total_pagado_smmlv is highly correlated with PuntoVenta__c and 4 other fields High correlation
AnnualRevenue is highly correlated with OtrosIngresos__c and 1 other fields High correlation
OtrosIngresos__c is highly correlated with AnnualRevenue High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
EstadoCivil__pc is highly correlated with Genero__pc High correlation
Genero__pc is highly correlated with EstadoCivil__pc High correlation
MarcaVehiculo__c has 29285 (100.0%) missing values Missing
MdeloVehiculo__c has 29285 (100.0%) missing values Missing
PlacaVehiculo__c has 20154 (68.8%) missing values Missing
n_prod_prev has 15104 (51.6%) missing values Missing
total_siniestros has 23226 (79.3%) missing values Missing
total_pagado_smmlv has 23226 (79.3%) missing values Missing
anios_ultimo_siniestro has 23226 (79.3%) missing values Missing
Activos__c has 7741 (26.4%) missing values Missing
AnnualRevenue has 7741 (26.4%) missing values Missing
MontoAnual__c has 29260 (99.9%) missing values Missing
OtrosIngresos__c has 9850 (33.6%) missing values Missing
Profesion__pc has 29285 (100.0%) missing values Missing
EgresosAnuales__c has 7741 (26.4%) missing values Missing
EstadoCivil__pc has 7485 (25.6%) missing values Missing
Genero__pc has 7485 (25.6%) missing values Missing
ciudad_name has 7485 (25.6%) missing values Missing
edad has 7553 (25.8%) missing values Missing
Activos__c is highly skewed (γ1 = 99.98653783) Skewed
AnnualRevenue is highly skewed (γ1 = 25.02132063) Skewed
OtrosIngresos__c is highly skewed (γ1 = 82.17193257) Skewed
EgresosAnuales__c is highly skewed (γ1 = 21.04856233) Skewed
PlacaVehiculo__c is uniformly distributed Uniform
MarcaVehiculo__c is an unsupported type, check if it needs cleaning or further analysis Unsupported
MdeloVehiculo__c is an unsupported type, check if it needs cleaning or further analysis Unsupported
Profesion__pc is an unsupported type, check if it needs cleaning or further analysis Unsupported
total_pagado_smmlv has 829 (2.8%) zeros Zeros
OtrosIngresos__c has 18461 (63.0%) zeros Zeros
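
The Constant and 100% Missing alerts above identify columns that carry no information. As a rough illustration of how such columns could be detected before modelling, here is a minimal sketch in plain Python. The function name `flag_rejected` and the toy rows are hypothetical; only the column names come from the report, and this is not how pandas-profiling itself implements its alerts.

```python
def flag_rejected(columns):
    """Return {column: reason} for constant or entirely missing columns.

    `columns` maps a column name to a list of values; None marks a missing value.
    """
    flagged = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        if not present:
            flagged[name] = "missing"
        elif len(set(present)) == 1:
            flagged[name] = "constant"
    return flagged


# Toy rows, not the real data: only the column names come from the report.
toy = {
    "ClaseVehiculo__c": ["99999", "99999", "99999"],
    "MarcaVehiculo__c": [None, None, None],
    "churn": ["0", "1", "1"],
}
print(flag_rejected(toy))
```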

Reproduction

Analysis started: 2022-06-18 20:48:35.686260
Analysis finished: 2022-06-18 20:49:07.495259
Duration: 31.81 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
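
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

started = datetime.fromisoformat("2022-06-18 20:48:35.686260")
finished = datetime.fromisoformat("2022-06-18 20:49:07.495259")

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```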

Variables

CodigoTipoAsegurado__c
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

Value  Count
1      28552
4      549
3      127
2      57

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 29285
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
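
To illustrate what a "category" means here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, the inputs are values that appear in this report, and the standard-library module exposes the general category only, not scripts or blocks.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# "Nd" is Decimal Number, "Ll" Lowercase Letter, "Zs" Space Separator.
codes = category_counts(["1", "4", "3", "2"])
ramo = category_counts(["responsabilidad civil"])
print(dict(codes))
print(dict(ramo))
```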

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      28552  97.5%
4      549    1.9%
3      127    0.4%
2      57     0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
128552
97.5%
4549
 
1.9%
3127
 
0.4%
257
 
0.2%

Most occurring characters

ValueCountFrequency (%)
128552
97.5%
4549
 
1.9%
3127
 
0.4%
257
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number29285
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
128552
97.5%
4549
 
1.9%
3127
 
0.4%
257
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common29285
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
128552
97.5%
4549
 
1.9%
3127
 
0.4%
257
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII29285
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
128552
97.5%
4549
 
1.9%
3127
 
0.4%
257
 
0.2%

PuntoVenta__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 283
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3127.066382
Minimum: 5
Maximum: 20007
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 5
5-th percentile: 103
Q1: 902
median: 2365
Q3: 3303
95-th percentile: 9721
Maximum: 20007
Range: 20002
Interquartile range (IQR): 2401

Descriptive statistics

Standard deviation: 3181.762286
Coefficient of variation (CV): 1.017491123
Kurtosis: 0.1990366466
Mean: 3127.066382
Median Absolute Deviation (MAD): 1317
Skewness: 1.263262232
Sum: 91576139
Variance: 10123611.25
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
9721                2524   8.6%
3301                1814   6.2%
1048                1608   5.5%
103                 1350   4.6%
2466                1329   4.5%
404                 1241   4.2%
3202                1215   4.1%
3502                1151   3.9%
301                 1064   3.6%
1503                951    3.2%
Other values (273)  15038  51.4%

Value  Count  Frequency (%)
5      162    0.6%
8      31     0.1%
9      53     0.2%
14     124    0.4%
16     119    0.4%
23     455    1.6%
25     120    0.4%
26     89     0.3%
100    1      < 0.1%
102    9      < 0.1%

Value  Count  Frequency (%)
20007  1      < 0.1%
10111  2      < 0.1%
9977   8      < 0.1%
9974   1      < 0.1%
9973   1      < 0.1%
9972   3      < 0.1%
9971   72     0.2%
9970   4      < 0.1%
9969   39     0.1%
9967   8      < 0.1%
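
Several of the PuntoVenta__c statistics are derived from others in the same block. The sketch below recomputes them from the reported mean, standard deviation, quartiles and extremes. It checks internal consistency only and is not the report's own code.

```python
# Values copied from the PuntoVenta__c statistics.
mean = 3127.066382
std = 3181.762286
q1, q3 = 902, 3303
minimum, maximum = 5, 20007

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr} range={value_range}")
```

The results agree with the printed CV (1.017491123), variance (10123611.25), IQR (2401) and range (20002) up to rounding.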

tipo_ramo_name
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value                  Count
automoviles            26359
previhogar             2830
responsabilidad civil  96

Length

Max length: 21
Median length: 11
Mean length: 10.93614478
Min length: 10

Characters and Unicode

Total characters: 320265
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: previhogar
2nd row: previhogar
3rd row: previhogar
4th row: previhogar
5th row: previhogar

Common Values

Value                  Count  Frequency (%)
automoviles            26359  90.0%
previhogar             2830   9.7%
responsabilidad civil  96     0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
automoviles26359
89.7%
previhogar2830
 
9.6%
responsabilidad96
 
0.3%
civil96
 
0.3%

Most occurring characters

ValueCountFrequency (%)
o55644
17.4%
i29573
9.2%
a29381
9.2%
e29285
9.1%
v29285
9.1%
l26551
8.3%
s26551
8.3%
m26359
8.2%
t26359
8.2%
u26359
8.2%
Other values (9)14918
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter320169
> 99.9%
Space Separator96
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o55644
17.4%
i29573
9.2%
a29381
9.2%
e29285
9.1%
v29285
9.1%
l26551
8.3%
s26551
8.3%
m26359
8.2%
t26359
8.2%
u26359
8.2%
Other values (8)14822
 
4.6%
Space Separator
ValueCountFrequency (%)
96
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin320169
> 99.9%
Common96
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o55644
17.4%
i29573
9.2%
a29381
9.2%
e29285
9.1%
v29285
9.1%
l26551
8.3%
s26551
8.3%
m26359
8.2%
t26359
8.2%
u26359
8.2%
Other values (8)14822
 
4.6%
Common
ValueCountFrequency (%)
96
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII320265
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o55644
17.4%
i29573
9.2%
a29381
9.2%
e29285
9.1%
v29285
9.1%
l26551
8.3%
s26551
8.3%
m26359
8.2%
t26359
8.2%
u26359
8.2%
Other values (9)14918
 
4.7%

tipo_prod_desc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value                         Count
automoviles                   26359
previhogar                    2830
profesionales medicos         56
directores y administradores  38
servidores publicos           2

Length

Max length: 28
Median length: 11
Mean length: 10.94509134
Min length: 10

Characters and Unicode

Total characters: 320527
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: previhogar
2nd row: previhogar
3rd row: previhogar
4th row: previhogar
5th row: previhogar

Common Values

Value                         Count  Frequency (%)
automoviles                   26359  90.0%
previhogar                    2830   9.7%
profesionales medicos         56     0.2%
directores y administradores  38     0.1%
servidores publicos           2      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
automoviles26359
89.6%
previhogar2830
 
9.6%
profesionales56
 
0.2%
medicos56
 
0.2%
directores38
 
0.1%
y38
 
0.1%
administradores38
 
0.1%
servidores2
 
< 0.1%
publicos2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o55796
17.4%
e29475
9.2%
i29419
9.2%
a29321
9.1%
v29191
9.1%
s26647
8.3%
m26453
8.3%
t26435
8.2%
l26417
8.2%
u26361
8.2%
Other values (11)15012
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter320393
> 99.9%
Space Separator134
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o55796
17.4%
e29475
9.2%
i29419
9.2%
a29321
9.2%
v29191
9.1%
s26647
8.3%
m26453
8.3%
t26435
8.3%
l26417
8.2%
u26361
8.2%
Other values (10)14878
 
4.6%
Space Separator
ValueCountFrequency (%)
134
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin320393
> 99.9%
Common134
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o55796
17.4%
e29475
9.2%
i29419
9.2%
a29321
9.2%
v29191
9.1%
s26647
8.3%
m26453
8.3%
t26435
8.3%
l26417
8.2%
u26361
8.2%
Other values (10)14878
 
4.6%
Common
ValueCountFrequency (%)
134
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII320527
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o55796
17.4%
e29475
9.2%
i29419
9.2%
a29321
9.1%
v29191
9.1%
s26647
8.3%
m26453
8.3%
t26435
8.2%
l26417
8.2%
u26361
8.2%
Other values (11)15012
 
4.7%

ClaseVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value  Count
99999  29285

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 146425
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 99999
2nd row: 99999
3rd row: 99999
4th row: 99999
5th row: 99999

Common Values

Value  Count  Frequency (%)
99999  29285  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
9999929285
100.0%

Most occurring characters

ValueCountFrequency (%)
9146425
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number146425
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9146425
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common146425
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9146425
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII146425
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9146425
100.0%

MarcaVehiculo__c
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 29285
Missing (%): 100.0%
Memory size: 228.9 KiB

MdeloVehiculo__c
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 29285
Missing (%): 100.0%
Memory size: 228.9 KiB

PlacaVehiculo__c
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 8933
Distinct (%): 97.8%
Missing: 20154
Missing (%): 68.8%
Memory size: 1.2 MiB

Value                Count
GCZ316               3
INW085               3
JKS685               3
IIL275               2
FUR962               2
Other values (8928)  9118

Length

Max length: 6
Median length: 6
Mean length: 5.999452415
Min length: 5

Characters and Unicode

Total characters: 54781
Distinct characters: 36
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 8738
Unique (%): 95.7%

Sample

1st row: IVW538
2nd row: TGL728
3rd row: HDQ543
4th row: HZQ669
5th row: INZ881

Common Values

Value                Count  Frequency (%)
GCZ316               3      < 0.1%
INW085               3      < 0.1%
JKS685               3      < 0.1%
IIL275               2      < 0.1%
FUR962               2      < 0.1%
FWS037               2      < 0.1%
GVO765               2      < 0.1%
SWK821               2      < 0.1%
GFR951               2      < 0.1%
GYN245               2      < 0.1%
Other values (8923)  9108   31.1%
(Missing)            20154  68.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
gcz3163
 
< 0.1%
jks6853
 
< 0.1%
inw0853
 
< 0.1%
rlu4262
 
< 0.1%
sqd7302
 
< 0.1%
smw0132
 
< 0.1%
szq3502
 
< 0.1%
xej9682
 
< 0.1%
fjs8982
 
< 0.1%
tsw5722
 
< 0.1%
Other values (8923)9108
99.7%

Most occurring characters

ValueCountFrequency (%)
02848
 
5.2%
22844
 
5.2%
72843
 
5.2%
32770
 
5.1%
12767
 
5.1%
92765
 
5.0%
42753
 
5.0%
82723
 
5.0%
52711
 
4.9%
62688
 
4.9%
Other values (26)27069
49.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number27712
50.6%
Uppercase Letter27069
49.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S2009
 
7.4%
T1669
 
6.2%
W1472
 
5.4%
J1369
 
5.1%
G1314
 
4.9%
U1258
 
4.6%
R1139
 
4.2%
F1116
 
4.1%
M1072
 
4.0%
K1064
 
3.9%
Other values (16)13587
50.2%
Decimal Number
ValueCountFrequency (%)
02848
10.3%
22844
10.3%
72843
10.3%
32770
10.0%
12767
10.0%
92765
10.0%
42753
9.9%
82723
9.8%
52711
9.8%
62688
9.7%

Most occurring scripts

ValueCountFrequency (%)
Common27712
50.6%
Latin27069
49.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
S2009
 
7.4%
T1669
 
6.2%
W1472
 
5.4%
J1369
 
5.1%
G1314
 
4.9%
U1258
 
4.6%
R1139
 
4.2%
F1116
 
4.1%
M1072
 
4.0%
K1064
 
3.9%
Other values (16)13587
50.2%
Common
ValueCountFrequency (%)
02848
10.3%
22844
10.3%
72843
10.3%
32770
10.0%
12767
10.0%
92765
10.0%
42753
9.9%
82723
9.8%
52711
9.8%
62688
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII54781
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
02848
 
5.2%
22844
 
5.2%
72843
 
5.2%
32770
 
5.1%
12767
 
5.1%
92765
 
5.0%
42753
 
5.0%
82723
 
5.0%
52711
 
4.9%
62688
 
4.9%
Other values (26)27069
49.4%

TipoVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value  Count
99999  29285

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 146425
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 99999
2nd row: 99999
3rd row: 99999
4th row: 99999
5th row: 99999

Common Values

Value  Count  Frequency (%)
99999  29285  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
9999929285
100.0%

Most occurring characters

ValueCountFrequency (%)
9146425
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number146425
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9146425
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common146425
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9146425
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII146425
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9146425
100.0%

FechaInicioVigencia__ctrim
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value             Count
03-2020           4404
02-2021           4031
02-2018           4024
03-2018           3985
01-2018           3739
Other values (6)  9102

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 204995
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 01-2018
2nd row: 01-2018
3rd row: 01-2018
4th row: 01-2018
5th row: 01-2018

Common Values

Value    Count  Frequency (%)
03-2020  4404   15.0%
02-2021  4031   13.8%
02-2018  4024   13.7%
03-2018  3985   13.6%
01-2018  3739   12.8%
02-2019  3659   12.5%
01-2019  3598   12.3%
01-2021  1810   6.2%
03-2019  26     0.1%
02-2020  7      < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
03-20204404
15.0%
02-20214031
13.8%
02-20184024
13.7%
03-20183985
13.6%
01-20183739
12.8%
02-20193659
12.5%
01-20193598
12.3%
01-20211810
6.2%
03-201926
 
0.1%
02-20207
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
062983
30.7%
251260
25.0%
134021
16.6%
-29285
14.3%
811748
 
5.7%
38415
 
4.1%
97283
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number175710
85.7%
Dash Punctuation29285
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
062983
35.8%
251260
29.2%
134021
19.4%
811748
 
6.7%
38415
 
4.8%
97283
 
4.1%
Dash Punctuation
ValueCountFrequency (%)
-29285
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common204995
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
062983
30.7%
251260
25.0%
134021
16.6%
-29285
14.3%
811748
 
5.7%
38415
 
4.1%
97283
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII204995
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
062983
30.7%
251260
25.0%
134021
16.6%
-29285
14.3%
811748
 
5.7%
38415
 
4.1%
97283
 
3.6%

churn
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

Value  Count
1      20018
0      9267

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 29285
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      20018  68.4%
0      9267   31.6%
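
The churn frequencies can be recomputed from the two counts. The sketch below also derives a majority-class baseline, on the assumption that churn serves as a classification target; the report itself does not say how the column is used.

```python
# Counts copied from the churn Common Values table.
counts = {"1": 20018, "0": 9267}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Accuracy of always predicting the most frequent class.
majority_baseline = max(counts.values()) / total

print(shares, f"baseline={majority_baseline:.3f}")
```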

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
120018
68.4%
09267
31.6%

Most occurring characters

ValueCountFrequency (%)
120018
68.4%
09267
31.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number29285
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
120018
68.4%
09267
31.6%

Most occurring scripts

ValueCountFrequency (%)
Common29285
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
120018
68.4%
09267
31.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII29285
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
120018
68.4%
09267
31.6%

n_prod_prev
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 12
Distinct (%): 0.1%
Missing: 15104
Missing (%): 51.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.867357732
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 8
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.232716874
Coefficient of variation (CV): 1.127420146
Kurtosis: 8.098201626
Mean: 2.867357732
Median Absolute Deviation (MAD): 1
Skewness: 2.763255298
Sum: 40662
Variance: 10.45045838
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
1                 6999   23.9%
2                 2720   9.3%
5                 2582   8.8%
3                 695    2.4%
16                524    1.8%
8                 278    0.9%
4                 237    0.8%
14                57     0.2%
10                33     0.1%
13                28     0.1%
Other values (2)  28     0.1%
(Missing)         15104  51.6%

Value  Count  Frequency (%)
1      6999   23.9%
2      2720   9.3%
3      695    2.4%
4      237    0.8%
5      2582   8.8%
6      16     0.1%
7      12     < 0.1%
8      278    0.9%
10     33     0.1%
13     28     0.1%

Value  Count  Frequency (%)
16     524    1.8%
14     57     0.2%
13     28     0.1%
10     33     0.1%
8      278    0.9%
7      12     < 0.1%
6      16     0.1%
5      2582   8.8%
4      237    0.8%
3      695    2.4%
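
The n_prod_prev value tables are complete enough to rebuild the summary figures. The sketch below uses the twelve value/count pairs read from those tables and shows that the mean is taken over non-missing rows only; dividing by all 29285 rows would give a different figure.

```python
# Value -> count, read from the n_prod_prev value tables.
value_counts = {
    1: 6999, 2: 2720, 3: 695, 4: 237, 5: 2582, 6: 16,
    7: 12, 8: 278, 10: 33, 13: 28, 14: 57, 16: 524,
}
n_rows = 29285  # number of observations in the dataset

n_present = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())

mean = total / n_present                         # missing rows are excluded
missing_pct = 100 * (n_rows - n_present) / n_rows

print(n_present, total, round(mean, 6), round(missing_pct, 1))
```

The results match the reported Sum (40662), Mean (2.867357732) and Missing (51.6%).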

total_siniestros
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 86
Distinct (%): 1.4%
Missing: 23226
Missing (%): 79.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1296.323651
Minimum: 1
Maximum: 3469
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 114
Q3: 2712
95-th percentile: 3469
Maximum: 3469
Range: 3468
Interquartile range (IQR): 2711

Descriptive statistics

Standard deviation: 1519.967527
Coefficient of variation (CV): 1.172521635
Kurtosis: -1.737231028
Mean: 1296.323651
Median Absolute Deviation (MAD): 113
Skewness: 0.4154775249
Sum: 7854425
Variance: 2310301.283
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
11535
 
5.2%
27121141
 
3.9%
3383907
 
3.1%
3469444
 
1.5%
137401
 
1.4%
2383
 
1.3%
4340
 
1.2%
3178
 
0.6%
114124
 
0.4%
572
 
0.2%
Other values (76)534
 
1.8%
(Missing)23226
79.3%
ValueCountFrequency (%)
11535
5.2%
2383
 
1.3%
3178
 
0.6%
4340
 
1.2%
572
 
0.2%
641
 
0.1%
753
 
0.2%
831
 
0.1%
920
 
0.1%
1025
 
0.1%
ValueCountFrequency (%)
3469444
 
1.5%
3383907
3.1%
27121141
3.9%
14035
 
< 0.1%
130311
 
< 0.1%
7671
 
< 0.1%
7041
 
< 0.1%
6721
 
< 0.1%
59456
 
0.2%
5321
 
< 0.1%

total_pagado_smmlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 1376
Distinct (%): 22.7%
Missing: 23226
Missing (%): 79.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 9170.924876
Minimum: 0
Maximum: 25003.07456
Zeros: 829
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3.333333761
median: 750.7467188
Q3: 18999.25909
95-th percentile: 25003.07456
Maximum: 25003.07456
Range: 25003.07456
Interquartile range (IQR): 18995.92576

Descriptive statistics

Standard deviation: 10730.94338
Coefficient of variation (CV): 1.170104818
Kurtosis: -1.718629928
Mean: 9170.924876
Median Absolute Deviation (MAD): 750.7467188
Skewness: 0.4227193829
Sum: 55566633.82
Variance: 115153145.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
18999.259091141
 
3.9%
23810.4818907
 
3.1%
0829
 
2.8%
25003.07456444
 
1.5%
1308.359896401
 
1.4%
1.318843012226
 
0.8%
750.7467188123
 
0.4%
4386.1212356
 
0.2%
41.5516750320
 
0.1%
456.456414118
 
0.1%
Other values (1366)1894
 
6.5%
(Missing)23226
79.3%
ValueCountFrequency (%)
0829
2.8%
0.011439345982
 
< 0.1%
0.058392310332
 
< 0.1%
0.065607814381
 
< 0.1%
0.066517251211
 
< 0.1%
0.068072377061
 
< 0.1%
0.068685503341
 
< 0.1%
0.08072274661
 
< 0.1%
0.082080840511
 
< 0.1%
0.098305006641
 
< 0.1%
ValueCountFrequency (%)
25003.07456444
 
1.5%
23810.4818907
3.1%
18999.259091141
3.9%
10127.778455
 
< 0.1%
9291.8917411
 
< 0.1%
8747.1641041
 
< 0.1%
5779.8093771
 
< 0.1%
4386.1212356
 
0.2%
2952.8700611
 
< 0.1%
2501.8172041
 
< 0.1%

anios_ultimo_siniestro
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 853
Distinct (%): 14.1%
Missing: 23226
Missing (%): 79.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4343833582
Minimum: 0.002739726027
Maximum: 17.61917808
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0.002739726027
5-th percentile: 0.002739726027
Q1: 0.002739726027
median: 0.002739726027
Q3: 0.6232876712
95-th percentile: 2.016438356
Maximum: 17.61917808
Range: 17.61643836
Interquartile range (IQR): 0.6205479452

Descriptive statistics

Standard deviation: 0.8429383817
Coefficient of variation (CV): 1.94054023
Kurtosis: 46.74859616
Mean: 0.4343833582
Median Absolute Deviation (MAD): 0
Skewness: 4.636355761
Sum: 2631.928767
Variance: 0.7105451154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0027397260273061
 
10.5%
0.9890410959227
 
0.8%
0.008219178082148
 
0.5%
0.435616438423
 
0.1%
0.0465753424722
 
0.1%
0.0575342465820
 
0.1%
0.0821917808219
 
0.1%
0.0547945205518
 
0.1%
0.205479452117
 
0.1%
0.090410958917
 
0.1%
Other values (843)2487
 
8.5%
(Missing)23226
79.3%
ValueCountFrequency (%)
0.0027397260273061
10.5%
0.00547945205514
 
< 0.1%
0.008219178082148
 
0.5%
0.0109589041112
 
< 0.1%
0.0136986301415
 
0.1%
0.0164383561615
 
0.1%
0.0191780821910
 
< 0.1%
0.0219178082214
 
< 0.1%
0.024657534256
 
< 0.1%
0.0273972602717
 
0.1%
ValueCountFrequency (%)
17.619178081
< 0.1%
9.9205479451
< 0.1%
9.0219178081
< 0.1%
8.9534246581
< 0.1%
8.6986301371
< 0.1%
8.3890410961
< 0.1%
8.2520547951
< 0.1%
7.9178082191
< 0.1%
7.5123287671
< 0.1%
7.0849315071
< 0.1%
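
The anios_ultimo_siniestro values look like whole day counts divided by 365: the minimum, 0.002739726027, equals 1/365. This is an inference from the numbers, not something the report states. A quick sketch converting a few reported values back to days:

```python
DAYS_PER_YEAR = 365  # assumption inferred from the minimum value, 1/365

# Values copied from the anios_ultimo_siniestro tables.
reported_years = [0.002739726027, 0.008219178082, 0.9890410959, 17.61917808]

for years in reported_years:
    days = years * DAYS_PER_YEAR
    print(f"{years} years ~ {days:.2f} days")

# Nearest whole number of days implied by each reported value.
implied_days = [round(y * DAYS_PER_YEAR) for y in reported_years]
```

If the assumption holds, the median of 0.002739726027 means at least half of the non-missing rows sit at a single day.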

Activos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 3955
Distinct (%): 18.4%
Missing: 7741
Missing (%): 26.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 422330779.4
Minimum: 0
Maximum: 1 × 10^12
Zeros: 21
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 22000000
Q1: 60000000
median: 120000000
Q3: 257407500
95-th percentile: 990926950
Maximum: 1 × 10^12
Range: 1 × 10^12
Interquartile range (IQR): 197407500

Descriptive statistics

Standard deviation: 9756606654
Coefficient of variation (CV): 23.10181292
Kurtosis: 10229.14145
Mean: 422330779.4
Median Absolute Deviation (MAD): 80000000
Skewness: 99.98653783
Sum: 9.09869431 × 10^12
Variance: 9.51913734 × 10^19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1000000001079
 
3.7%
80000000816
 
2.8%
150000000766
 
2.6%
200000000760
 
2.6%
50000000655
 
2.2%
120000000590
 
2.0%
60000000566
 
1.9%
40000000478
 
1.6%
30000000449
 
1.5%
90000000447
 
1.5%
Other values (3945)14938
51.0%
(Missing)7741
26.4%
ValueCountFrequency (%)
021
 
0.1%
172
0.2%
23
 
< 0.1%
202
 
< 0.1%
403
 
< 0.1%
501
 
< 0.1%
801
 
< 0.1%
1003
 
< 0.1%
1041
 
< 0.1%
3501
 
< 0.1%
ValueCountFrequency (%)
1 × 10122
< 0.1%
1 × 10112
< 0.1%
5.835 × 10101
 
< 0.1%
5.43154931 × 10101
 
< 0.1%
4.479 × 10101
 
< 0.1%
4.0078653 × 10103
< 0.1%
3.1092794 × 10103
< 0.1%
2.8125029 × 10102
< 0.1%
2.43 × 10101
 
< 0.1%
2.0615603 × 10101
 
< 0.1%
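
Activos__c is flagged as highly skewed (γ1 = 99.98653783), and its largest values are orders of magnitude above the median. The sketch below shows how a skewness coefficient responds to a single extreme value. It uses the population formula g1 = m3 / m2^1.5; the report may use a bias-corrected estimator, so this is an illustration of the idea and not a reproduction of the reported figure. The function name and data are hypothetical.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 over the non-missing values."""
    data = [v for v in values if v is not None]
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in data) / n   # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 1, 1, 1, 1000]   # toy data, not taken from the report

print(skewness(symmetric))
print(skewness(long_tail))
```

A symmetric sample gives a skewness of zero, while one very large value pulls it strongly positive.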

AnnualRevenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED

Distinct: 3904
Distinct (%): 18.1%
Missing: 7741
Missing (%): 26.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 215724725.8
Minimum: 0
Maximum: 8.63 × 10^10
Zeros: 8
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8500000
Q1: 25200000
median: 43375000
Q3: 80000000
95-th percentile: 479991450
Maximum: 8.63 × 10^10
Range: 8.63 × 10^10
Interquartile range (IQR): 54800000

Descriptive statistics

Standard deviation: 1593181656
Coefficient of variation (CV): 7.385252895
Kurtosis: 891.9396781
Mean: 215724725.8
Median Absolute Deviation (MAD): 21625000
Skewness: 25.02132063
Sum: 4.647573494 × 10^12
Variance: 2.538227789 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
36000000 | 1115 | 3.8%
60000000 | 874 | 3.0%
24000000 | 839 | 2.9%
30000000 | 796 | 2.7%
48000000 | 674 | 2.3%
18000000 | 464 | 1.6%
12000000 | 402 | 1.4%
50000000 | 401 | 1.4%
40000000 | 395 | 1.3%
42000000 | 326 | 1.1%
Other values (3894) | 15258 | 52.1%
(Missing) | 7741 | 26.4%

Value | Count | Frequency (%)
0 | 8 | < 0.1%
1 | 27 | 0.1%
20 | 2 | < 0.1%
52230 | 1 | < 0.1%
250000 | 1 | < 0.1%
328000 | 1 | < 0.1%
350000 | 1 | < 0.1%
500000 | 3 | < 0.1%
600000 | 1 | < 0.1%
700000 | 3 | < 0.1%

Value | Count | Frequency (%)
8.63 × 10^10 | 1 | < 0.1%
6.6728 × 10^10 | 1 | < 0.1%
6 × 10^10 | 2 | < 0.1%
4.1610143 × 10^10 | 2 | < 0.1%
3.6110425 × 10^10 | 5 | < 0.1%
3.5084 × 10^10 | 2 | < 0.1%
2.6097952 × 10^10 | 2 | < 0.1%
2.576590619 × 10^10 | 2 | < 0.1%
2.469016564 × 10^10 | 3 | < 0.1%
2.3626255 × 10^10 | 2 | < 0.1%

MontoAnual__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 12
Distinct (%): 48.0%
Missing: 29260
Missing (%): 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2005177.64
Minimum: 0
Maximum: 50000000
Zeros: 10
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 100
Q3: 10000
95-th percentile: 24321
Maximum: 50000000
Range: 50000000
Interquartile range (IQR): 10000

Descriptive statistics

Standard deviation: 9998924.797
Coefficient of variation (CV): 4.98655311
Kurtosis: 24.99996075
Mean: 2005177.64
Median Absolute Deviation (MAD): 100
Skewness: 4.99999434
Sum: 50129441
Variance: 9.997849709 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
0 | 10 | < 0.1%
3000 | 3 | < 0.1%
21605 | 2 | < 0.1%
18000 | 2 | < 0.1%
10000 | 1 | < 0.1%
5525 | 1 | < 0.1%
600 | 1 | < 0.1%
50000000 | 1 | < 0.1%
25000 | 1 | < 0.1%
5 | 1 | < 0.1%
Other values (2) | 2 | < 0.1%
(Missing) | 29260 | 99.9%

Value | Count | Frequency (%)
0 | 10 | < 0.1%
1 | 1 | < 0.1%
5 | 1 | < 0.1%
100 | 1 | < 0.1%
600 | 1 | < 0.1%
3000 | 3 | < 0.1%
5525 | 1 | < 0.1%
10000 | 1 | < 0.1%
18000 | 2 | < 0.1%
21605 | 2 | < 0.1%

Value | Count | Frequency (%)
50000000 | 1 | < 0.1%
25000 | 1 | < 0.1%
21605 | 2 | < 0.1%
18000 | 2 | < 0.1%
10000 | 1 | < 0.1%
5525 | 1 | < 0.1%
3000 | 3 | < 0.1%
600 | 1 | < 0.1%
100 | 1 | < 0.1%
5 | 1 | < 0.1%

OtrosIngresos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 292
Distinct (%): 1.5%
Missing: 9850
Missing (%): 33.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2736204.184
Minimum: 0
Maximum: 8400000000
Zeros: 18461
Zeros (%): 63.0%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 24000
Maximum: 8400000000
Range: 8400000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 75798999.89
Coefficient of variation (CV): 27.70224544
Kurtosis: 8236.986482
Mean: 2736204.184
Median Absolute Deviation (MAD): 0
Skewness: 82.17193257
Sum: 5.317812832 × 10^10
Variance: 5.745488384 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 18461 | 63.0%
12000000 | 53 | 0.2%
10000000 | 38 | 0.1%
6000000 | 33 | 0.1%
24000000 | 31 | 0.1%
30000000 | 31 | 0.1%
5000000 | 26 | 0.1%
18000000 | 22 | 0.1%
20000000 | 19 | 0.1%
15000000 | 17 | 0.1%
Other values (282) | 704 | 2.4%
(Missing) | 9850 | 33.6%

Value | Count | Frequency (%)
0 | 18461 | 63.0%
9583 | 1 | < 0.1%
24000 | 2 | < 0.1%
178000 | 2 | < 0.1%
183000 | 1 | < 0.1%
200000 | 6 | < 0.1%
201105 | 2 | < 0.1%
228000 | 1 | < 0.1%
239000 | 1 | < 0.1%
240000 | 2 | < 0.1%

Value | Count | Frequency (%)
8400000000 | 1 | < 0.1%
3439000000 | 2 | < 0.1%
1664801000 | 4 | < 0.1%
936054000 | 1 | < 0.1%
928737000 | 1 | < 0.1%
862423500 | 2 | < 0.1%
590960000 | 2 | < 0.1%
436838000 | 1 | < 0.1%
398188968 | 1 | < 0.1%
268346000 | 3 | < 0.1%

Profesion__pc
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 29285
Missing (%): 100.0%
Memory size: 228.9 KiB

EgresosAnuales__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED

Distinct: 2996
Distinct (%): 13.9%
Missing: 7741
Missing (%): 26.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 138388559.1
Minimum: 0
Maximum: 3.6967344 × 10^10
Zeros: 15
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3000000
Q1: 12000000
median: 24000000
Q3: 48000000
95-th percentile: 350000000
Maximum: 3.6967344 × 10^10
Range: 3.6967344 × 10^10
Interquartile range (IQR): 36000000

Descriptive statistics

Standard deviation: 976454775.4
Coefficient of variation (CV): 7.055892349
Kurtosis: 570.954123
Mean: 138388559.1
Median Absolute Deviation (MAD): 14000000
Skewness: 21.04856233
Sum: 2.981443117 × 10^12
Variance: 9.534639283 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
12000000 | 1137 | 3.9%
30000000 | 948 | 3.2%
24000000 | 913 | 3.1%
18000000 | 857 | 2.9%
20000000 | 838 | 2.9%
36000000 | 514 | 1.8%
15000000 | 494 | 1.7%
10000000 | 493 | 1.7%
40000000 | 488 | 1.7%
25000000 | 427 | 1.5%
Other values (2986) | 14435 | 49.3%
(Missing) | 7741 | 26.4%

Value | Count | Frequency (%)
0 | 15 | 0.1%
1 | 148 | 0.5%
10 | 8 | < 0.1%
18 | 1 | < 0.1%
100 | 1 | < 0.1%
204 | 1 | < 0.1%
20000 | 1 | < 0.1%
50000 | 2 | < 0.1%
70000 | 1 | < 0.1%
72000 | 3 | < 0.1%

Value | Count | Frequency (%)
3.6967344 × 10^10 | 2 | < 0.1%
3.0868341 × 10^10 | 5 | < 0.1%
2.2582482 × 10^10 | 2 | < 0.1%
2.166849912 × 10^10 | 4 | < 0.1%
2.1322738 × 10^10 | 2 | < 0.1%
1.9746108 × 10^10 | 2 | < 0.1%
1.923885007 × 10^10 | 3 | < 0.1%
1.8035021 × 10^10 | 1 | < 0.1%
1.629516502 × 10^10 | 1 | < 0.1%
1.4868 × 10^10 | 2 | < 0.1%

EstadoCivil__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 7485
Missing (%): 25.6%
Memory size: 1.5 MiB

Value | Count
SOLTERO | 10290
CASADO | 9471
OTRO | 1546
UNIDO | 294
VIUDO | 74
Other values (3) | 125

Length

Max length: 10
Median length: 8
Mean length: 6.325825688
Min length: 3

Characters and Unicode

Total characters: 137903
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SOLTERO
2nd row: CASADO
3rd row: SOLTERO
4th row: CASADO
5th row: SOLTERO

Common Values

Value | Count | Frequency (%)
SOLTERO | 10290 | 35.1%
CASADO | 9471 | 32.3%
OTRO | 1546 | 5.3%
UNIDO | 294 | 1.0%
VIUDO | 74 | 0.3%
SEPARADO | 68 | 0.2%
DIVORCIADO | 44 | 0.2%
N A | 13 | < 0.1%
(Missing) | 7485 | 25.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
soltero | 10290 | 47.2%
casado | 9471 | 43.4%
otro | 1546 | 7.1%
unido | 294 | 1.3%
viudo | 74 | 0.3%
separado | 68 | 0.3%
divorciado | 44 | 0.2%
n | 13 | 0.1%
a | 13 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
O | 33667 | 24.4%
S | 19829 | 14.4%
A | 19135 | 13.9%
R | 11948 | 8.7%
T | 11836 | 8.6%
E | 10358 | 7.5%
L | 10290 | 7.5%
D | 9995 | 7.2%
C | 9515 | 6.9%
I | 456 | 0.3%
Other values (5) | 874 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 137890 | > 99.9%
Space Separator | 13 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
O | 33667 | 24.4%
S | 19829 | 14.4%
A | 19135 | 13.9%
R | 11948 | 8.7%
T | 11836 | 8.6%
E | 10358 | 7.5%
L | 10290 | 7.5%
D | 9995 | 7.2%
C | 9515 | 6.9%
I | 456 | 0.3%
Other values (4) | 861 | 0.6%
Space Separator
Value | Count | Frequency (%)
(space) | 13 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 137890 | > 99.9%
Common | 13 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
O | 33667 | 24.4%
S | 19829 | 14.4%
A | 19135 | 13.9%
R | 11948 | 8.7%
T | 11836 | 8.6%
E | 10358 | 7.5%
L | 10290 | 7.5%
D | 9995 | 7.2%
C | 9515 | 6.9%
I | 456 | 0.3%
Other values (4) | 861 | 0.6%
Common
Value | Count | Frequency (%)
(space) | 13 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 137903 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
O | 33667 | 24.4%
S | 19829 | 14.4%
A | 19135 | 13.9%
R | 11948 | 8.7%
T | 11836 | 8.6%
E | 10358 | 7.5%
L | 10290 | 7.5%
D | 9995 | 7.2%
C | 9515 | 6.9%
I | 456 | 0.3%
Other values (5) | 874 | 0.6%

Genero__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 7485
Missing (%): 25.6%
Memory size: 1.6 MiB

Value | Count
MASCULINO | 16001
FEMENINO | 5792
N A | 7

Length

Max length: 9
Median length: 9
Mean length: 8.732385321
Min length: 3

Characters and Unicode

Total characters: 190366
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MASCULINO
2nd row: FEMENINO
3rd row: MASCULINO
4th row: MASCULINO
5th row: FEMENINO

Common Values

Value | Count | Frequency (%)
MASCULINO | 16001 | 54.6%
FEMENINO | 5792 | 19.8%
N A | 7 | < 0.1%
(Missing) | 7485 | 25.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
masculino | 16001 | 73.4%
femenino | 5792 | 26.6%
n | 7 | < 0.1%
a | 7 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
N | 27592 | 14.5%
M | 21793 | 11.4%
I | 21793 | 11.4%
O | 21793 | 11.4%
A | 16008 | 8.4%
S | 16001 | 8.4%
C | 16001 | 8.4%
U | 16001 | 8.4%
L | 16001 | 8.4%
E | 11584 | 6.1%
Other values (2) | 5799 | 3.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 190359 | > 99.9%
Space Separator | 7 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
N | 27592 | 14.5%
M | 21793 | 11.4%
I | 21793 | 11.4%
O | 21793 | 11.4%
A | 16008 | 8.4%
S | 16001 | 8.4%
C | 16001 | 8.4%
U | 16001 | 8.4%
L | 16001 | 8.4%
E | 11584 | 6.1%
Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 190359 | > 99.9%
Common | 7 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 27592 | 14.5%
M | 21793 | 11.4%
I | 21793 | 11.4%
O | 21793 | 11.4%
A | 16008 | 8.4%
S | 16001 | 8.4%
C | 16001 | 8.4%
U | 16001 | 8.4%
L | 16001 | 8.4%
E | 11584 | 6.1%
Common
Value | Count | Frequency (%)
(space) | 7 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 190366 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
N | 27592 | 14.5%
M | 21793 | 11.4%
I | 21793 | 11.4%
O | 21793 | 11.4%
A | 16008 | 8.4%
S | 16001 | 8.4%
C | 16001 | 8.4%
U | 16001 | 8.4%
L | 16001 | 8.4%
E | 11584 | 6.1%
Other values (2) | 5799 | 3.0%

ciudad_name
Categorical

HIGH CORRELATION
MISSING

Distinct: 22
Distinct (%): 0.1%
Missing: 7485
Missing (%): 25.6%
Memory size: 1.6 MiB

Value | Count
otras | 17023
BOGOTÁ D.C. | 1549
MEDELLIN | 674
CALI | 651
CARTAGENA | 189
Other values (17) | 1714

Length

Max length: 13
Median length: 5
Mean length: 5.77766055
Min length: 4

Characters and Unicode

Total characters: 125953
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: otras
2nd row: otras
3rd row: otras
4th row: PASTO
5th row: otras

Common Values

Value | Count | Frequency (%)
otras | 17023 | 58.1%
BOGOTÁ D.C. | 1549 | 5.3%
MEDELLIN | 674 | 2.3%
CALI | 651 | 2.2%
CARTAGENA | 189 | 0.6%
VILLAVICENCIO | 182 | 0.6%
ARMENIA | 176 | 0.6%
BUCARAMANGA | 174 | 0.6%
PASTO | 167 | 0.6%
MANIZALES | 160 | 0.5%
Other values (12) | 855 | 2.9%
(Missing) | 7485 | 25.6%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
otras | 17023 | 72.5%
bogotá | 1549 | 6.6%
d.c | 1549 | 6.6%
medellin | 674 | 2.9%
cali | 651 | 2.8%
cartagena | 189 | 0.8%
villavicencio | 182 | 0.8%
armenia | 176 | 0.7%
bucaramanga | 174 | 0.7%
pasto | 167 | 0.7%
Other values (17) | 1152 | 4.9%

Most occurring characters

Value | Count | Frequency (%)
o | 17023 | 13.5%
r | 17023 | 13.5%
a | 17023 | 13.5%
s | 17023 | 13.5%
t | 17023 | 13.5%
A | 4078 | 3.2%
O | 3823 | 3.0%
C | 3240 | 2.6%
. | 3098 | 2.5%
L | 2975 | 2.4%
Other values (21) | 23624 | 18.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 85115 | 67.6%
Uppercase Letter | 36054 | 28.6%
Other Punctuation | 3098 | 2.5%
Space Separator | 1686 | 1.3%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 4078 | 11.3%
O | 3823 | 10.6%
C | 3240 | 9.0%
L | 2975 | 8.3%
I | 2682 | 7.4%
E | 2480 | 6.9%
D | 2353 | 6.5%
T | 2070 | 5.7%
G | 2019 | 5.6%
N | 1918 | 5.3%
Other values (14) | 8416 | 23.3%
Lowercase Letter
Value | Count | Frequency (%)
o | 17023 | 20.0%
r | 17023 | 20.0%
a | 17023 | 20.0%
s | 17023 | 20.0%
t | 17023 | 20.0%
Other Punctuation
Value | Count | Frequency (%)
. | 3098 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1686 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 121169 | 96.2%
Common | 4784 | 3.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 17023 | 14.0%
r | 17023 | 14.0%
a | 17023 | 14.0%
s | 17023 | 14.0%
t | 17023 | 14.0%
A | 4078 | 3.4%
O | 3823 | 3.2%
C | 3240 | 2.7%
L | 2975 | 2.5%
I | 2682 | 2.2%
Other values (19) | 19256 | 15.9%
Common
Value | Count | Frequency (%)
. | 3098 | 64.8%
(space) | 1686 | 35.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 124160 | 98.6%
None | 1793 | 1.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 17023 | 13.7%
r | 17023 | 13.7%
a | 17023 | 13.7%
s | 17023 | 13.7%
t | 17023 | 13.7%
A | 4078 | 3.3%
O | 3823 | 3.1%
C | 3240 | 2.6%
. | 3098 | 2.5%
L | 2975 | 2.4%
Other values (18) | 21831 | 17.6%
None
Value | Count | Frequency (%)
Á | 1628 | 90.8%
Ú | 112 | 6.2%
É | 53 | 3.0%

edad
Real number (ℝ≥0)

MISSING

Distinct: 10412
Distinct (%): 47.9%
Missing: 7553
Missing (%): 25.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.3867382
Minimum: 1.42739726
Maximum: 122.5424658
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 228.9 KiB

Quantile statistics

Minimum: 1.42739726
5-th percentile: 29.1509589
Q1: 39.74794521
median: 49.54520548
Q3: 60.25068493
95-th percentile: 73.98082192
Maximum: 122.5424658
Range: 121.1150685
Interquartile range (IQR): 20.50273973

Descriptive statistics

Standard deviation: 14.19263628
Coefficient of variation (CV): 0.2816740435
Kurtosis: 0.3553377722
Mean: 50.3867382
Median Absolute Deviation (MAD): 10.24931507
Skewness: 0.3639585464
Sum: 1095004.595
Variance: 201.4309247
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
46.72054795 | 80 | 0.3%
47.54794521 | 77 | 0.3%
42.49041096 | 41 | 0.1%
122.5424658 | 27 | 0.1%
43.71780822 | 22 | 0.1%
36.53972603 | 20 | 0.1%
47.72054795 | 16 | 0.1%
71.55616438 | 14 | < 0.1%
53.49589041 | 13 | < 0.1%
41.61369863 | 13 | < 0.1%
Other values (10402) | 21409 | 73.1%
(Missing) | 7553 | 25.8%

Value | Count | Frequency (%)
1.42739726 | 1 | < 0.1%
3.405479452 | 1 | < 0.1%
3.438356164 | 1 | < 0.1%
4.263013699 | 1 | < 0.1%
4.287671233 | 1 | < 0.1%
4.367123288 | 1 | < 0.1%
4.380821918 | 2 | < 0.1%
4.42739726 | 1 | < 0.1%
4.460273973 | 2 | < 0.1%
4.794520548 | 1 | < 0.1%

Value | Count | Frequency (%)
122.5424658 | 27 | 0.1%
104.9863014 | 2 | < 0.1%
103.1041096 | 3 | < 0.1%
100.0493151 | 1 | < 0.1%
98.52876712 | 2 | < 0.1%
96.39452055 | 2 | < 0.1%
96.3260274 | 2 | < 0.1%
96.1369863 | 1 | < 0.1%
95.54794521 | 2 | < 0.1%
95.46575342 | 2 | < 0.1%

Interactions

[Figures: 121 pairwise interaction plots, rendered with Matplotlib v3.5.2; images not included in this text version]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
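
The calculation described above can be sketched in a few lines. This is a minimal, standard-library illustration, not the code the report generator runs; the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and tied values are assumed to receive the mean of their rank positions.

```python
from statistics import mean, pstdev

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A nonlinear but strictly increasing relationship still gives rho close to 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

Because only the ranks enter the computation, any strictly increasing transformation of either variable leaves ρ unchanged.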

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
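
As an illustration of this definition and of the invariance property mentioned above, here is a minimal standard-library sketch. The function name `pearson_r` and the toy data are assumptions made for the example; they are not taken from the dataset.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(round(r, 4), round(r_rescaled, 4))
```

The two printed values should coincide, which is the location and scale invariance in practice. Rescaling by a negative factor would flip the sign of r.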

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
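
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical. Library implementations such as `scipy.stats.kendalltau` default to the tie-corrected tau-b variant, so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant (the swapped middle values).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct approach examines every pair, so its cost grows quadratically with the number of observations.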

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
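
A minimal sketch of the bias-corrected estimator, assuming Bergsma's published correction and a plain Pearson chi-squared statistic without continuity correction. The function name `cramers_v_corrected` is hypothetical, and the report generator's own implementation may differ in details.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length lists of labels."""
    n = len(x)
    observed = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # A constant column carries no association information (a choice made here).
        return 0.0
    return sqrt(phi2_corr / denom)

same = ["a"] * 50 + ["b"] * 50
print(cramers_v_corrected(same, ["u"] * 50 + ["v"] * 50))      # perfectly associated
print(cramers_v_corrected(["a", "a", "b", "b"] * 10,
                          ["u", "v", "u", "v"] * 10))          # independent
```

The correction subtracts the expected chi-squared contribution of pure noise, which is why independent columns map to exactly 0 instead of a small positive value.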

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available separately.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
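
The nullity correlation shown in the heatmap can be approximated as the Pearson correlation between per-column missingness indicators. This is a minimal sketch under that assumption; the function name `nullity_correlation`, the use of `None` as the missing marker, and the toy columns are all illustrative and are not taken from the dataset.

```python
from statistics import mean, pstdev

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    miss_a = [1.0 if v is None else 0.0 for v in col_a]
    miss_b = [1.0 if v is None else 0.0 for v in col_b]
    mean_a, mean_b = mean(miss_a), mean(miss_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(miss_a, miss_b)) / len(miss_a)
    # Undefined (division by zero) if a column is fully present or fully missing.
    return cov / (pstdev(miss_a) * pstdev(miss_b))

# Toy columns: the first two are missing on the same rows, the third on the others.
status = ["SOLTERO", None, "CASADO", None, "OTRO"]
gender = ["MASCULINO", None, "FEMENINO", None, "MASCULINO"]
other = [None, 2.0, None, 4.0, None]

print(nullity_correlation(status, gender))  # missing together
print(nullity_correlation(status, other))   # missing on complementary rows
```

A value near +1 means two fields tend to be missing on the same records, and a value near -1 means one tends to be present exactly when the other is absent.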

Sample

First rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_ramo_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cPlacaVehiculo__cTipoVehiculo__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcciudad_nameedad
01404previhogarprevihogar99999NaNNaNNaN9999901-20180NaNNaNNaNNaN1.320000e+087.000000e+05NaNNaNNaN1.000000e+00SOLTEROMASCULINOotras49.257534
11103previhogarprevihogar99999NaNNaNNaN9999901-20180NaNNaNNaNNaN3.000000e+072.090000e+07NaNNaNNaN1.820000e+07CASADOFEMENINOotras83.991781
215previhogarprevihogar99999NaNNaNNaN9999901-20181NaNNaNNaNNaN3.434361e+097.238820e+08NaN0.0NaN5.662310e+08SOLTEROMASCULINOotras37.389041
311402previhogarprevihogar99999NaNNaNNaN9999901-201812.0NaNNaNNaN4.260000e+081.123034e+08NaN100000000.0NaN9.600000e+07CASADOMASCULINOPASTO50.161644
417002previhogarprevihogar99999NaNNaNNaN9999901-20181NaNNaNNaNNaN6.000000e+072.900000e+07NaNNaNNaN2.900000e+07SOLTEROFEMENINOotras37.106849
517007previhogarprevihogar99999NaNNaNNaN9999901-201811.0NaNNaNNaN2.068315e+087.445627e+07NaN0.0NaN6.645600e+07OTROMASCULINOBOGOTÁ D.C.38.383562
61404previhogarprevihogar99999NaNNaNNaN9999901-20180NaNNaNNaNNaN6.000000e+084.900000e+07NaNNaNNaN3.500000e+07OTROFEMENINOotras78.463014
71404previhogarprevihogar99999NaNNaNNaN9999901-20181NaNNaNNaNNaN1.700000e+083.600000e+07NaNNaNNaN2.400000e+07OTROFEMENINOCALI72.504110
81404previhogarprevihogar99999NaNNaNNaN9999901-20180NaNNaNNaNNaN1.360000e+081.450000e+07NaN9000000.0NaN2.350000e+07CASADOMASCULINOCALI82.301370
91404previhogarprevihogar99999NaNNaNNaN9999901-20180NaNNaNNaNNaN2.361596e+092.776884e+09NaNNaNNaN1.853277e+09OTROMASCULINOCALI63.627397

Last rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_ramo_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cPlacaVehiculo__cTipoVehiculo__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcciudad_nameedad
2927513301responsabilidad civilprofesionales medicos99999NaNNaNNaN9999902-20211NaNNaNNaNNaN324578000.083160000.0NaN0.0NaN1.0OTROMASCULINOotras61.082192
2927613301responsabilidad civildirectores y administradores99999NaNNaNNaN9999902-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2927711048responsabilidad civildirectores y administradores99999NaNNaNNaN9999902-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2927811048responsabilidad civildirectores y administradores99999NaNNaNNaN9999902-20211NaN2.01.4151210.397260NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2927913301responsabilidad civilprofesionales medicos99999NaNNaNNaN9999902-20211NaNNaNNaNNaN18000000.060000000.0NaN0.0NaN58000000.0CASADOMASCULINOBOGOTÁ D.C.NaN
2928013301responsabilidad civildirectores y administradores99999NaNNaNNaN9999902-2021114.0704.05779.8093770.063014NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2928133202responsabilidad civilprofesionales medicos99999NaNNaNNaN9999902-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2928218001responsabilidad civilprofesionales medicos99999NaNNaNNaN9999902-20211NaNNaNNaNNaN500000000.090000000.0NaN0.0NaN65000000.0SOLTEROMASCULINOBOGOTÁ D.C.122.542466
2928318001responsabilidad civildirectores y administradores99999NaNNaNNaN9999902-20210NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2928418001responsabilidad civildirectores y administradores99999NaNNaNNaN9999902-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN